Disclaimer

Last updated on 2024-06-11.

Anderson Banihirwe

I contribute to and maintain several libraries in the open source scientific Python stack, with a focus on improving the scalability of Python tools in order to (1) handle large-scale datasets on High Performance Computing and Cloud Computing platforms and (2) move the open science paradigm forward.

Education

B.S., Computer Systems Engineering

University of Arkansas at Little Rock

Little Rock, AR

2018 - 2014

Professional Experience

Software Engineer

National Center for Atmospheric Research

Boulder, CO

current - 2018-10

  • Contributed to and helped maintain the core software stack powering the Pangeo project, including dask and intake.
  • Developed and maintained Xarray, an open source library for working with multidimensional, labeled datasets and arrays in Python.
  • Created and maintained intake-ESM, a Python data cataloguing package for exploring and ingesting Earth System Model datasets.
  • Developed and delivered weekly, live (virtual and in-person), self-paced technical tutorials to NCAR scientists and their collaborators.

Software Developer Intern

Quansight

Austin, TX

2018-09 - 2018-05

  • Developed xndframes, a Pandas ExtensionDtype/Array backed by xnd, a container type that maps most Python values relevant for scientific computing directly to typed memory.
  • Worked on integrating cuDF, a GPU dataframe library, with the Apache Arrow library.

Data Science Intern

First Orion

Little Rock, AR

2018-04 - 2017-11

  • Built scoring and predictive models with scikit-learn, Dask, and Apache Spark using First Orion’s proprietary telecommunication data.

Research Intern

National Center for Atmospheric Research

Boulder, CO

2017-08 - 2017-05

  • Developed spark-xarray, a Python package that integrates PySpark and xarray for climate data analysis.

Selected Publications, Posters, and Talks

Building Tools for the Scientific Python Community

12th Symposium on Advances in Modeling and Analysis Using Python at 2022 AMS Annual Meeting

Online

2022-01

  • Invited keynote talk.

The Current State of Deploying Dask on HPC Systems

2021 Dask Developer Summit

Online

2021-05

  • Contributed talk.

Cloud-Native Repositories for Big Scientific Data

Computing in Science and Engineering

N/A

2020-11

  • Authored with Ryan Abernathey, Tom Augspurger, et al.

Pangeo Benchmarking Analysis: Object Storage vs. POSIX File System

5th International Parallel Data Systems Workshop at 2020 Supercomputing Conference

N/A

2020-10

  • Authored with Haiying Xu, Kevin Paul.

The Pangeo Ecosystem: Interactive Computing Tools for the Geosciences: Benchmarking on HPC

2019 Supercomputing Conference Workshop on Interactive High-Performance Computing

N/A

2020-01

  • Authored with Tina Erica Odaka, Guillaume Eynard-Bontemps, Aurelien Ponte, Guillaume Maze, Kevin Paul, Jared Baker, Ryan Abernathey.

Pangeo Use Case: Analyzing Initialized Climate Prediction System Datasets with climpred

NOAA’s 45th Climate Diagnostics & Prediction Workshop

Online

2020-10

  • Invited talk about climpred, a Python package for weather and climate forecasts.

Zarr: chunked, compressed, multidimensional arrays

2020 Cloud Native Geospatial Outreach Day

Online

2020-09

  • Invited talk about Zarr, an open source data format for the storage of chunked, compressed, multidimensional arrays.

Intake-ESM – Making It Easier To Consume Climate and Weather Data

2020 ESIP Summer Meeting

Online

2020-07

  • Invited talk about intake-esm, an intake plugin for working with Earth System Model (ESM) datasets.

Dask and Pangeo

2020 Dask Developer Summit

Washington, D.C.

2020-02

  • Invited talk.

Perceptual Judgments to Detect Computer Generated Forged Faces in Social Media

IAPR Workshop on Multimodal Pattern Recognition of Social Signals in Human-Computer Interaction

N/A

2019-01

  • Authored with Suzan Anwar, Mariofanna Milanova, Mardin Anwer.

Interactive Supercomputing with Dask and Jupyter

2019 Scientific Computing with Python conference

Austin, TX

2019-07

  • Contributed talk about Dask and Jupyter.

Beyond Matplotlib - Tutorial: Building Interactive Climate Data Visualizations with Bokeh and Friends

2018 UCAR Software Engineering Assembly conference

Boulder, CO

2018-04

  • Contributed tutorial about interactive visualization with Python.

PySpark for “Big” Atmospheric Data Analysis

8th Symposium on Advances in Modeling and Analysis Using Python at 2018 AMS Annual Meeting

Austin, TX

2018-01